the playing area that is fluoresced in the non-visible spectrum by the markers and is 
detectable by the first and / or second set of movable cameras. 


With respect to the Office Action, the following list outlines the sections of my response: 

1. A summary of major differences between my teachings and those of Sengupta; 

2. Review of two comparison figures drawn to highlight the differences between Sengupta and 
my teachings and claims; 

3. A discussion of the changes incorporated into new claims 97 through 124; especially how they 
differentiate my invention from the referenced art, and 

4. Summary arguments for the allowance of the revised claims. 

(1) A summary of the major differences between Sengupta and my teaching: 

Please note the following concerning the references provided supporting Aman in the table below. 
First, they are taken from two sources: 

• The original "Multiple Object Tracking System," U.S. Patent No. 6,567,116 B1 filed on Nov. 20, 
1998 and issued on May 20, 2003 - these references begin with "*" and are bolded; 

• The current specification entitled "Optimizations for Live Event, Real-time, 3D Object Tracking," 
which is a continuation in part of U.S. Patent No. 6,567,116 B1 - these references begin with 
"**". 

Teaching | Sengupta | Aman 


A. Multi-camera 
"Figure / Centroid" 
Tracking System 

• Operates within predefined area; (Col 1: 12-15) 

• (same); (* Col 10: 42-44), (** pg. 6, 1st paragraph), 
(** pg. 46, paragraph 3) 

• Operates during predefined time; (Col 1: 25-28) 

• (same); (* Col 15: 47-51), (* Col 20: 61-65), (** pg. 
10, paragraph 2) 


• No specified or inherent grouping of cameras; 
(Col 3: 8-11) 

• Any cameras may be either fixed or movable; 
(Col 3: 8-11), (Col 4: 55-59) 

• Two specific and distinct sets of cameras; 

• 1st Set is all fixed cameras for X, Y tracking; (* Col 
10: 42-47), (** pg. 46, paragraph 3) 

• 2nd Set is all movable cameras for game videoing 
(/filming); (* Col 10: 66 - Col 11: 4), (** pg. 46, 
paragraph 4 - pg. 47, paragraph 1) 


• No requirement that camera views overlap to 
completely cover predefined area; (Col 2: 27-35), 
(Col 7: 53-56), (Col 8: 36-40), (Col 10: 3-7) 

• 1st Set of Cameras must have overlapping views that 
completely cover predefined area; (* Col 13: 63 - Col 
14: 10), (pg. 46, paragraphs 2-3) 

• All cameras perform the same basic figure X, Y 
centroid tracking; (Col 4: 35-45); (Col 5: 58-62) 

• 1st Set of Cameras exclusively for figure X, Y 
centroid tracking; (* Col 15: 36-40), (* Col 15: 47-51), 
(* Col 18: 13-18), (pg. 46, paragraphs 2-3) 

• 2nd Set of Cameras is for videoing and is not required 
for figure X, Y centroid tracking; (* Col 13: 41-44), 
(pg. 46, paragraph 4) 

• There is an implication that the multiple 
videoing / tracking cameras are at a perspective 
view, thus tracking the figure's centroid is made 
more difficult and requires either triangulation, 
based upon two overlapping camera views, or 
ranging, based upon dynamically adjusting the 
camera's focus to estimate the distance from the 
camera to the figure; (Col 6: 15-25) 

• The 1st Set of Cameras is able to determine X, Y 
centroid with sufficient accuracy since the set 
maintains a view that is substantially parallel with the 
tracking surface on which the figures are moving 
about, therefore triangulation is not necessary for two 
dimensional calculations; (* Col 13: 63 - Col 14: 4), 
(** pg. 46, paragraphs 2-3) 

• Ranging via dynamic focus adjustments is time 
consuming and too slow for real-time sports tracking, 
especially in light of some sports objects traveling at 
speeds in excess of 100 miles per hour; (* Col 15: 61 
-Col 16:8) 

• As an alternative to triangulation and ranging, 
figure centroids can be calculated if assumptions 
are made about physical size; again, this is 
because the cameras are placed at a perspective 
to the figure; (Col 6: 65 - Col 7: 4) 

• Still further, by intentionally placing the camera 
at a perspective and known angle to the tracking 
surface, this angle along with the detected 
vertical position of the figure in the image can 
be used to estimate the figure's X, Y centroid; 
(Col 7: 4-8) 

• Very often players in a sporting contest will fall down 
or at least bend over, thus significantly altering the 
apparent size from a perspective view; by switching to 
a primarily overhead view, these changes in player 
posture become less important and tend to affect 
mainly the Z (height) dimension rather than the X, Y 
centroid; 

• Placing the tracking cameras at a perspective angle 
with the tracking area makes them highly susceptible 
to figure / player occlusions where one player is 
blocking the view of another, thus defeating a 
perspective based tracking technique; 

• All cameras function independently and 
discontinuously depending upon the current X, 
Y centroid location of the figure being tracked - 
i.e. each camera only functions when the system 
determines that the figure being tracked is 
within its potential view; (Col 9: 14-16), (Col 9: 
53 - Col 10: 3); (Col 10: 7-11) 

• Each and every camera in the 1st Set of Cameras 
continuously looks for figures throughout the entire 
predefined time, regardless of the current X, Y 
centroid location - i.e. regardless of whether or not the 
figure is in that camera's current view; (* Col 15: 47-59), 
(* Col 16: 12-23), (** pg. 16, paragraph 2) 

B. Figure Tracking 
Algorithm 

• Multiple figures may be moving throughout the 
entire predefined area, throughout the entire 
predefined time; (Col 4: 22-23) 

• (same); (* Col 15: 36-42), (** pg. 16, paragraph 2) 

• Of all the multiple figures, only a selected figure 
of interest is tracked and videoed, all other non- 
selected figures are intentionally ignored; (Col 
4: 10-13); (Col 4: 16-19); (Col 4: 35-45) 

• All of the multiple figures moving within the 
predefined area are tracked; (* Col 15: 36-42), (** pg. 
18, paragraph 2), (** pg. 13, paragraph 2), (** pg. 16, 
paragraph 2) 

• Since there is no requirement that the entire 
predefined area be in view of at least one 
camera at all times, figure tracking can only be 
initiated when the desired figure appears in a 
first camera; (Col 4: 3-10) 

• Figure tracking is automatically initiated for each 
figure as soon as it enters any portion of the 
predefined area (and is therefore now in view of at 
least one camera in the 1st Set of Cameras); (* Col 15: 
47-51), (** pg. 51, paragraph 3) 

• Since there is no requirement that the entire 
predefined area be in view of at least one 
camera at all times there may be some locations 
where figure movement must be estimated / 
predicted because the figure is not in view to be 
tracked; (Col 7: 57-60), (Col 8: 8-14), (Col 8: 
36-40), (Col 10: 3-7) 

• Since the 1st Set of Cameras is specifically limited to a 
contiguous formation completely covering the 
predefined area, all figures are within the view of at 
least one camera at all times; (* Col 14: 3-10), (** pg. 
46, paragraphs 2-3) 

• The figure of interest may be automatically 
detected and manually selected, for tracking, 
which is the preferred approach in a "busy 
scene" (such as a sporting event); (Col 4: 3-7), 
(Col 4: 20-25), (Col 7: 36-41) 

• All figures are of interest and are all detected 
automatically and therefore also automatically 
selected for tracking, including within a busy scene; (* 
Col 18: 55 - Col 19: 17), (* Col 20: 61-65) , (** pg. 
16, paragraph 2), (** pg. 47, paragraph 3) 

• The figure of interest may be automatically 
detected and automatically selected, based upon 
"detectable differences in the particular target 
profile such as size, shape, speed, etc."; (Col 4: 
8-10) 

• At least with sports, there is often no realistic means 
for differentiating between targets (players) based 
upon size, shape, speed, etc. 

• All figures are always tracked and, in a sporting event, 
they are differentiated by uniquely encoded markings; 
(* Col 11: 15-25), (* Col 11: 58-61), (* Col 19: 14-16), 
(** Fig.s 1-12, elements 560, 574), (** pg. 7, 
paragraph 2), (** pg. 31, paragraph 2), (** pg. 34, 
paragraph 3 - pg. 35, paragraph 1), (** pg. 49, 
paragraph 3), (** pg. 50, paragraph 4 - pg. 51, 
paragraph 1) 

• The figure tracking algorithm includes a 
"determinator" whose responsibility is to 
determine the location of the figure within the 
current camera's field of view and the camera's 
physical location within the predefined area; 
(Col 3: 54-60) 

• There is no equivalent determinator required since all 
cameras in the entire 1st Set of Cameras are always 
searching their fields-of-view for any and all figures 
only with respect to their current view, where this 
information is then passed to the tracking algorithm 
that combines these detected centroids from all 1st 
cameras in order to determine every figure's current 
physical location within the predefined area; (* Col 
13: 21-25), (** pg. 16, paragraph 2), (** pg. 31, 
paragraph 2) 

• The figure tracking algorithm next uses a 
"controller" to determine which cameras' fields 
of view include the figure's physical location 
and then select the appropriate next camera to 
"handoff" videoing of the figure to, thereby 
also handing off the responsibility for 
figure tracking; (Col 3: 60-65), (Col 4: 27-45), 
(Col 7: 42-52) 

• There is no equivalent "controller" that is selecting 
between cameras in the 1st Set of Cameras to 
determine which camera's view currently includes a 
selected figure, so as to then "handoff" responsibility 
for videoing that figure to the selected 1st Set camera - 
again, this is because all 1st Set cameras are 
continuously tracking, regardless of whether they 
include a selected figure, let alone any figure; (* Col 
13: 21-25), (** pg. 16, paragraph 2), (** pg. 31, 
paragraph 2) 

C. Multi-camera 
Videoing System 


• There is no distinction between tracking 
cameras and videoing cameras, therefore any 
camera that is tracking figure centroids may also 
video and vice versa, in the end every camera 
may at some point be a part of the figure 
centroid tracking function; (Col 4: 10-13) 

• However, since only those cameras with a view 
of a selected figure are inputting data to the 
figure tracking system, then only those cameras 
are also creating video for the system; 

• One or more moveable cameras form a 2nd Set of 
Cameras, entirely distinct from the 1st Set of Cameras, 
that has no responsibility for figure centroid tracking 
(all of which is done with the 1st Set of fixed 
Cameras); (* Col 10: 66 - Col 11: 4), (* Col 13: 41-44), 
(* Col 15: 19-30), (Fig. 8 - element 40 for 
videoing vs. element 20c for X, Y tracking), (* Col 
21: 32-34), (* Col 22: 4-6), (** pg. 31, paragraph 2), 
(** pg. 34, paragraph 3 - pg. 35, paragraph 1) 

• Capturing video of the entire predefined area leads to 
the ability to combine all sources of video into a single 
composite view - therefore, video of a currently 
"figure empty" portion of a sports playing field is also 
of value because it holds a place with respect to all 
other views; 

• Since only the centroid of the selected figure is 
being tracked, the system has insufficient 
information for alternating between possible 
movable cameras to direct "best-views" of a 
selected figure, consequently an operator is 
relied upon to pick from multiple possible 
views; (Col 7: 36-52) 


• Since the centroids for all figures are always tracked, 
the system uses this information to dynamically adjust 
which 2nd cameras are assigned to video which 
figures; (* Col 13: 41-44), (* Col 15: 19-30), (** pg. 
31, paragraph 2), (** pg. 34, paragraph 3 - pg. 35, 
paragraph 1) 


D. "Figure feature / 
location point" 
Tracking System 


• There is no teaching presented to allow for the 
determination of a three-dimensional body-point 
model of the selected figure, let alone all figures 
within the predefined area; 


• Each of the multiple movable cameras in the 2nd Set of 
Cameras may be used to gather additional video data 
that is searchable for specific locations on each 
tracked figure, where the locations may be tracked in 
three dimensions and provide a model of each figure; 
(* Col 11: 15-28), (** pg. 31, paragraph 2), (** pg. 
34, paragraph 3 - pg. 35, paragraph 1), (** pg. 46, 
paragraph 4 - pg. 47, paragraph 1), (** pg. 50, 
paragraph 2) 


• There is no teaching presented for adhering 
markers to key body joints of the figures to be 
tracked in order to support the creation of the 
three dimensional body-point model; 


• One or more locations on each figure may be marked 
with reflective, retro-reflective or fluorescent markers 
that are used to more efficiently detect key body- 
points for forming the three-dimensional model; (** 
Fig.s 1-12, elements 530, 540, 550), (** pg. 6, 
paragraph 2), (** pg. 6, paragraph 5), (** pg. 48, 
paragraph 2), (** pg. 49, paragraph 3) 

• The markers may operate either within the visible or 
non-visible spectrum; (** Fig.s 1-12, elements 510, 
520), (** pg. 48, paragraph 3), (** pg. 49, paragraph 
3) 


E. "Figure 
Identification" method 
to combine with Figure 
Tracking 


• There is no teaching presented to allow for the 
unique determination of the selected figure's 
identity (e.g. "player 22"); 



• Each figure may be marked with an encoded marker 
that is detectable within at least the 1st Set of Camera 
views and can be translated into each tracked figure's 
unique identity to be matched with their movements; 
(* Col 11: 58-64), (* Col 21: 17-20), (* Col 22: 7-11), 
(** Fig.s 1-12, elements 560, 574), (** pg. 7, 
paragraph 2), (** pg. 31, paragraph 2), (** pg. 34, 
paragraph 3 - pg. 35, paragraph 1), (** pg. 49, 
paragraph 3), (** pg. 50, paragraph 4 - pg. 51, 
paragraph 1) 


• There is no teaching presented for adhering 
encoded markers to the upper surfaces of the 
figures to be tracked in order to support the 
automatic determination of their identity to be 
associated with their current centroid location; 


• Each figure may be marked with a uniquely encoded 
reflective, retro-reflective or fluorescent marker on 
their upper surface to be primarily in view of the 1st 
Set of Cameras, where each marker may be more 
easily detected and decoded into that figure's unique 
identity for association with their current centroid; (** 
Fig.s 1-12, elements 560, 574), (** pg. 48, paragraph 
2), (** pg. 49, paragraph 3) 

• The markers may operate either within the visible or 
non-visible spectrum; (** Fig.s 1-12, elements 510, 
520), (** pg. 7, paragraph 4), (** pg. 48, paragraph 3), 
(** pg. 49, paragraph 3) 

(2) Review of two comparison figures drawn to highlight the differences between Sengupta and my 
teachings and claims: 

Attached please find the following two Figures drawn to highlight and summarize the differences 
between Sengupta and my teachings, as follows: 

1. "Sengupta et al.": This figure depicts a "Figure Tracking System" attached via controlled 
switches to multiple fixed and / or movable cameras that cover some portion, but not 
necessarily all, of a given predefined tracking area. Hence, the entire view created by all 
possible tracking cameras is neither contiguous nor required to cover the entire predefined 
area. Also shown is a figure (in this case a hockey player) that is moving throughout the 
predefined area; sometimes in view of one or more cameras, sometimes not in view. As the 
player moves about, the Figure Tracking System, for example, begins tracking by use of fixed 
camera 1, after which tracking is "handed off" in succession to movable camera 2, fixed camera 3, 
(player then enters a hole in the view), fixed camera 4 and finally movable camera 5. At any given 
moment, the Figure Tracking System receives video data from a single camera (e.g. 1 through 
5) that it then uses for image analysis and resulting figure centroid determination. Hence, each 
camera is not continuously used for tracking, but rather only when a selected figure of interest 
is first determined to be entering or existing within its potential field of view. 

2. "Aman": This figure depicts a "Figure Tracking System" attached to a matrix of two or more 
fixed overhead X, Y tracking cameras that are limited to form a contiguous view of the entire 
predefined area. Also shown is the same figure of a hockey player that is moving throughout the 
predefined area and is always in view of the overhead tracking cameras. The Figure Tracking 
System continuously receives video from each of the two or more overhead tracking cameras for 
analysis throughout the entire predefined tracking time, regardless of whether or not any 
particular figure / player is within its view. Hence, the overhead tracking cameras are always 
being processed and always tracking all players, not simply a selected player. One or more 
movable side view cameras may also be dynamically adjusted to continuously follow the 
figure or any tracked object. Adjustable side view cameras may also be used for additional 
analysis by the Figure Tracking System; not for determination of each figure's / player's / object's 
X, Y centroid location with respect to the predefined area, but rather for the determination of 
the current X, Y, Z coordinates of one or more locations / marks on each figure / player. 
Although not depicted, each player may also be wearing a uniquely encoded marker on some 
portion of their upper surface, such as their shoulders or helmet, that may be detected and 
decoded by the Figure Tracking System using the video captured by the overhead tracking 
system. This additional information thereby allows the Figure Tracking System to include 
unique figure / player identity along with the current location / tracking database. 
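
For illustration only, the continuous combination of centroids from every fixed overhead camera, as described for the "Aman" figure, might be sketched in Python as follows. This sketch is not taken from either specification: the names OverheadCamera and merge_frame, the per-camera origin and scale calibration, and the 0.5 meter merge radius are all hypothetical assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class OverheadCamera:
    """Hypothetical fixed overhead camera with a simple linear calibration."""
    name: str
    origin_x: float          # field X (meters) of image pixel (0, 0); assumed known
    origin_y: float          # field Y (meters) of image pixel (0, 0); assumed known
    meters_per_pixel: float  # assumed uniform scale for a view parallel to the surface

    def to_field(self, px, py):
        # With an overhead view, pixel position maps directly to field X, Y.
        return (self.origin_x + px * self.meters_per_pixel,
                self.origin_y + py * self.meters_per_pixel)


def merge_frame(cameras, detections, merge_radius=0.5):
    """Combine centroids from every camera for one frame.

    detections maps a camera name to a list of (px, py) pixel centroids.
    Every camera is visited on every frame, whether or not it saw a figure.
    Centroids closer than merge_radius (overlapping views) are averaged.
    """
    merged = []  # entries are (x, y, number_of_contributing_cameras)
    for cam in cameras:
        for px, py in detections.get(cam.name, []):
            x, y = cam.to_field(px, py)
            for i, (mx, my, n) in enumerate(merged):
                if hypot(x - mx, y - my) <= merge_radius:
                    merged[i] = ((mx * n + x) / (n + 1),
                                 (my * n + y) / (n + 1), n + 1)
                    break
            else:
                merged.append((x, y, 1))
    return [(round(x, 3), round(y, 3)) for x, y, _ in merged]


if __name__ == "__main__":
    cams = [OverheadCamera("A", 0.0, 0.0, 0.01),
            OverheadCamera("B", 8.0, 0.0, 0.01)]
    # One player stands in the overlap of A and B; a second is seen only by B.
    frame = {"A": [(900, 300)], "B": [(100, 300), (700, 400)]}
    print(merge_frame(cams, frame))
```

Note that no camera is ever selected or "handed off" to in this sketch; a camera with an empty view simply contributes no centroids for that frame.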

(3) A discussion of the changes incorporated into new claims 97 through 124; especially how they 
differentiate my invention from the referenced art: 

With respect to these new claims, there are two independent apparatus claims 97 and 107 along with 
similar independent method claims 111 and 121, respectively. Attention will now be paid therefore to 
new apparatus claims 97 and 107 with the understanding that the discussions are similarly applicable to 
new method claims 111 and 121 respectively. First, in relation to claim 97, it is most directly 
comparable to previously cancelled claim 61. In a comparison between original claim 61 and new 
claim 97, especially in light of the recent office action and the prior reference of Sengupta, the 
following limitations have been added to the new claim 97: 

1. The first set of cameras is restricted to "two or more," rather than allowing just one stationary 
X, Y tracking camera; 

2. It is explicitly claimed that the first set of cameras alone is "exclusively responsible" for 
ongoing centroid X, Y tracking for each participant (i.e. player / figure / object) throughout 
the entire time of tracking; 

3. It is explicitly claimed that the "current centroid location" of any tracked participant (i.e. 
player / figure / object) does not affect each first camera's continuous capture of video for 
continuous analysis (hence, when a player moves out of one 1st camera's view and into 
another, both cameras remain in full operation); 

4. It is explicitly claimed that the first algorithm is "simultaneously analyzing" the video data 
from all two or more first tracking cameras, rather than possibly just one camera at a time, and 
further that it is analyzing "continuous images from each first camera" (i.e. every 1st camera is 
always on and always being used for tracking, regardless of the presence or not of a participant 
or object within a given 1st camera's field of view); 

5. It is explicitly claimed that the first algorithm creates the tracking database "continuously 
throughout the predefined time" by combining the ongoing "centroid X, Y coordinates" 
determined from "each and every first set camera"; 

6. It is explicitly claimed that the second set of movable cameras is "distinct from the first set of 
stationary cameras," and 

7. It is explicitly claimed that the video images from the second set of movable cameras are "not 
used to either determine any participant's or object's centroid X, Y coordinates or to otherwise 
update the tracking database." 
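
As a hedged illustration of limitations 6 and 7, the following Python sketch shows one way a second set of movable cameras could be aimed from the tracking database without ever writing to it. The function name aim_second_set, the camera positions, and the assignment map are hypothetical and are not drawn from the claims or the specification; the read-only view simply makes the one-way flow of data explicit.

```python
from math import atan2, degrees
from types import MappingProxyType


def aim_second_set(tracking_db, camera_positions, assignments):
    """Return a pan angle (degrees) for each movable camera.

    tracking_db:      participant -> (x, y) centroid from the first set only
    camera_positions: movable camera -> (x, y) mounting position (assumed known)
    assignments:      movable camera -> participant it should follow
    """
    # The second set may only read the database; any write attempt through
    # this view raises TypeError.
    read_only_db = MappingProxyType(tracking_db)
    angles = {}
    for camera, participant in assignments.items():
        cam_x, cam_y = camera_positions[camera]
        x, y = read_only_db[participant]
        angles[camera] = degrees(atan2(y - cam_y, x - cam_x))
    return angles


if __name__ == "__main__":
    db = {"player 22": (10.0, 10.0)}
    print(aim_second_set(db, {"side1": (0.0, 0.0)}, {"side1": "player 22"}))
```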

Second, in relation to claim 107, it is not directly comparable to any of original claims 50 through 96 
primarily because it is crafted more as a picture claim, thereby incorporating some of the original 
dependents (such as 66, covering uniquely identifying participants via encoded markings) into the 
tracking portion of the independent apparatus of claim 97. Especially in light of the recent office action 
and the prior reference of Sengupta, the following limitations have been added to the new claim 107: 

1. The system is for "automatically uniquely identifying and tracking one or more participants. . . 
and game objects," where the unique identity is determined from "encoded markers adhered to 
the top surface of each participant"; 

2. The first set of cameras is restricted to "two or more," rather than allowing just one stationary 
X, Y tracking camera; 

3. The first set of cameras is by implication "exclusively responsible" for ongoing centroid X, Y 
tracking for each participant (i.e. player / figure / object) throughout the entire time duration 
of tracking, simply because no other cameras (i.e. side view cameras) are included in the 
claim; 

4. It is explicitly claimed that the first algorithm is "simultaneously analyzing" the video data 
from all two or more first tracking cameras, rather than possibly just one camera at a time, and 
further that it is analyzing "continuous images from each first camera" (i.e. every 1st camera is 
always on and always being used for tracking, regardless of the presence or not of a participant 
or object within a given 1st camera's field of view), and 

5. It is explicitly claimed that the first algorithm creates the tracking database "continuously 
throughout the predefined time" by combining the ongoing "centroid X, Y coordinates" along 
with the "determined each participant's and / or object's unique identity" (based upon the 
"encoded markers adhered onto a top surface") using data (exclusively by implication) from 
"each and every first set camera." 

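Purely as an illustrative sketch, the combination of centroid coordinates with a decoded marker identity described in limitation 5 could look like the Python below. The function update_tracking_database, the roster lookup, the integer marker codes, and the "unidentified:" placeholder key are hypothetical assumptions; the specification does not prescribe this data layout.

```python
def update_tracking_database(database, roster, frame_time, detections):
    """Append one frame of first-set detections to the tracking database.

    database:   identity -> list of (time, x, y) samples (modified in place)
    roster:     decoded marker code -> identity, e.g. "player 22" (assumed given)
    detections: iterable of (x, y, marker_code) from the overhead cameras
    """
    for x, y, marker_code in detections:
        identity = roster.get(marker_code)
        if identity is None:
            # Assumption: a centroid whose marker is not in the roster is still
            # tracked, under a placeholder key derived from the code.
            identity = f"unidentified:{marker_code}"
        database.setdefault(identity, []).append((frame_time, x, y))
    return database


if __name__ == "__main__":
    db = {}
    roster = {0b10110: "player 22"}
    update_tracking_database(db, roster, 0.0, [(9.0, 3.0, 0b10110)])
    update_tracking_database(db, roster, 0.1, [(9.5, 3.0, 0b10110)])
    print(db)
```
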

(4) Summary arguments for the allowance of the revised claims: 

As can be seen, the teachings of my original application for a "Multiple Object Tracking System" as 
continued in part with the present application for "Optimizations for Live Event, Real-time, 3D Object 
Tracking," are clearly different from all of the cited art, especially Sengupta. In summary, the 
differences include, but are not limited to: 

1. Two distinct sets of cameras - the first exclusively for figure centroid tracking and the second 
for videoing without contributing to centroid tracking; 

2. The first set of cameras is restricted to covering the entire predefined tracking area in order to 
effectively support the claimed multiple object tracking as would, for example, be 
representative of the needs of a sporting event (as opposed to simply following a figure as it 
walks around a building); 

3. Each camera in the first set is always tracking, regardless of whether or not any given figure 
currently exists within its view - this helps to simplify the complexity of working with a larger 
number of first tracking cameras, especially when a large number of fast moving and bunching 
figures are being tracked as would be the case in a sporting event; 

4. Uniquely encoded marks are placed on the upper surfaces of figures / participants / players / 
objects to be identified such that this mark can be picked up within the view of the first 
cameras; thus allowing the first set of cameras to perform the data collection necessary for both 
player tracking and identification, and 

5. The second set of movable cameras need not provide any data to any figure tracking system or 
its equivalent and can be restricted to simply gathering video of the movement of the 
figures. 

In conclusion, I respectfully request that you allow new claims 97 through 124 as submitted, as they 
clearly differentiate from the cited art my original teachings contained within U.S. Patent No. 
6,567,116 B1 filed on Nov. 20, 1998 entitled "Multiple Object Tracking System," as well as the current 
continuation of this original application entitled "Optimizations for Live Event, Real-time, 3D Object 
Tracking." Furthermore, in light of the full support for both independent apparatus claims 97 and 107 
as well as parallel independent method claims 111 and 121 respectively, I ask that the invention date 
for the allowed claims be set as the original filing date of Nov. 20, 1998, based upon U.S. Patent No. 
6,567,116 B1 entitled "Multiple Object Tracking System." 

I thank you for your consideration in these matters. 


Sincerely, 



This communication was mailed Post Office To Addressee from Lansdale, Pennsylvania on 
using label HH^l 05 Examiner Senfi, Behrooz on 12/91706. 